The modification-dependent restriction endonuclease AspBHI recognizes 5-methylcytosine (5mC) in the 
double-stranded DNA sequence context (C/T)(C/G)(5mC)N(C/G) (N = any nucleotide) and cleaves the two 
strands a fixed distance (N12/N16) 3' to the modified cytosine. We determined the crystal structure of the 
homo-tetrameric AspBHI. Each subunit of the protein comprises two domains: an N-terminal 
DNA-recognition domain and a C-terminal DNA-cleavage domain. The N-terminal domain is structurally 
similar to the eukaryotic SET and RING-associated (SRA) domain, which is known to bind to a 
hemi-methylated CpG dinucleotide. The C-terminal domain is structurally similar to classic Type II 
restriction enzymes and contains the endonuclease catalytic-site motif DX20EAK. To understand how 
specific amino acids affect AspBHI recognition preference, we generated a homology model of the 
AspBHI-DNA complex, and probed the importance of individual amino acids by mutagenesis. Ser41 and 
Arg42 are predicted to be located in the DNA minor groove 5' to the modified cytosine. Substitution of 
Ser41 with alanine (S41A) and cysteine (S41C) resulted in mutants with altered cleavage activity. All 19 
Arg42 variants resulted in loss of endonuclease activity. 



Mammalian DNA cytosine methylation is an important epigenetic modification. It remains unclear how 
cytosine methylation within particular sequences is initiated, maintained and, in particular, recognized. 
Epigenetic DNA modification is dynamic, and differences are found in the epigenomes of cells during 
normal development, aging and mental health, and during pathologic processes such as cancer, among many 
others. To learn more about the role of epigenetic modification in development and disease, and to understand 
the mechanisms that control its locations and levels in the human genome, the genomic locations of modified 
cytosines must be mapped accurately, to single-base resolution. Newly identified 'modification-dependent' 
restriction endonucleases are proving useful for this purpose and for understanding how specific recognition of 
modified cytosine occurs. 

AspBHI from Azoarcus sp. BH72 belongs to a family of modification-dependent restriction endonucleases that 
recognize 5-methylcytosine (5mC) in the context of specific DNA sequences and cleave N12/N16 3' downstream 
of the modified cytosine. These proteins vary in length from 388 amino acids (AspBHI) to 456 (MspJI), and 
include a conserved core region of ~390 amino acids (Fig. 1a). FspEI has an additional amino-terminal 50 amino 
acids not present in other family members, whereas MspJI has insertions in multiple locations. Apart from MspJI, the 
other family members share sequence conservation throughout the entire region, with invariant (~26%) or 
conservatively substituted positions (~30%) scattered throughout the conserved core (Fig. 1b). Only one 
insertion of six residues was found in the conserved core of LpnPI (residues 316-321). 

Previously we reported the tetrameric structure of MspJI, which recognizes (5mC)NN(G/A). Here we report 
the structure of AspBHI, which recognizes (C/T)(C/G)(5mC)N(C/G), and we confirm that it also forms a 
tetramer. To understand how specific amino acids of AspBHI determine its substrate recognition preference, 
we generated a homology model of the AspBHI-DNA complex, and probed the importance of a number of 
individual amino acids by mutagenesis. 
Figure 1 | AspBHI is a member of the MspJI family. (a) Schematic representation of AspBHI and members of the MspJI family. The conserved region is shown 
in dark grey and insertions are shown in open boxes. (b) Sequence alignment of AspBHI and members of the MspJI family. The AspBHI residue numbering is 
shown above the sequence alignment. The pairwise comparison of AspBHI and MspJI was shown previously. Amino acids highlighted are either 
invariant (white against black) among the five proteins or similar (white against grey) as defined by the following groupings: V, L, I, and M; F, Y, and W; K 
and R; E and D; Q and N; E and Q; D and N; S and T; and A, G, and P. Helices are labeled αA-αM; strands are labeled β1-β15 (strand β8 is subdivided into 
β8₁ and β8₂ owing to a discontinuity in this strand). (c) Distribution of averaged crystallographic thermal B factor per residue. 
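
The invariant/similar classification described in the legend can be sketched as a per-column test on an alignment. This is an illustrative sketch only: the groupings are those quoted in the legend, but the function names, the treatment of gaps as non-conserved, and the toy alignment are assumptions and not data from this study.

```python
# Similarity groupings quoted in the Figure 1 legend.
GROUPS = [set("VLIM"), set("FYW"), set("KR"), set("ED"), set("QN"),
          set("EQ"), set("DN"), set("ST"), set("AGP")]


def classify_column(column: str) -> str:
    """Label one alignment column as 'invariant', 'similar' or 'variable'."""
    residues = set(column.upper())
    if "-" in residues:
        return "variable"  # assumption: any gap makes the column non-conserved
    if len(residues) == 1:
        return "invariant"
    if any(residues <= group for group in GROUPS):
        return "similar"  # all residues fall within a single grouping
    return "variable"


def conservation_fractions(alignment: list[str]) -> dict[str, float]:
    """Fraction of columns in each class for equal-length aligned sequences."""
    labels = [classify_column("".join(col)) for col in zip(*alignment)]
    return {k: labels.count(k) / len(labels)
            for k in ("invariant", "similar", "variable")}


# Toy five-sequence alignment (hypothetical, for illustration only).
toy = ["DKVSA", "DRLTG", "DKIQP", "DRMSA", "DKVTG"]
print(conservation_fractions(toy))
```

Applied to the real five-protein alignment, a calculation of this kind would be expected to give fractions comparable to the ~26% invariant and ~30% conservatively substituted positions stated in the text, depending on how gapped columns are counted.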



Results 

Tetrameric form of AspBHI. We determined the structure of 
AspBHI at a resolution of 2.8 Å (Table 1). Like MspJI, AspBHI 
is assembled into a tetramer, formed by molecules A, B, C, and D 
(Fig. 2a-b). Molecules A and B form a closed dimer with high-quality 
electron density observed for all 388 residues. Interestingly, 
molecules C and D have an intact N-terminal (DNA-recognition) 
domain up to Pro216, but the entire C-terminal (DNA-cleavage) 
domain could not be traced owing to discontinuous residual 
density. We inferred the general location of the C-terminal 
domains of molecules C and D by comparison with those of MspJI 
(Fig. 2c), and found them to be in a void along the crystallographic 
6-fold axis with a diameter of 100 Å (Fig. 2d). Absence of crystal 
packing forces may allow the C-terminal domains of molecules C 
and D to be mobile and thus unobservable. Analytical gel-filtration 
measurement confirmed that AspBHI exists as a tetramer in solution 
(Fig. 2e). An "invisible" domain in a protein crystal structure is not a 
common occurrence, but several examples have been observed. In 
these structures, as in ours, a large space is found where a domain 
connected to another by a linker can move as a rigid body owing to 
the absence of any intra-molecular or inter-molecular crystal-packing 
interactions. 
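
As a rough plausibility check on the gel-filtration result, the expected mass of the tetramer can be estimated from the chain length. The sketch below assumes the common rule-of-thumb average of about 110 Da per residue; this value and the helper name approx_mass_kda are assumptions for illustration, and tags or bound ions are ignored.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def approx_mass_kda(n_residues: int, n_subunits: int = 1) -> float:
    """Approximate protein mass in kDa from residue count and oligomeric state."""
    return n_residues * n_subunits * AVG_RESIDUE_MASS_DA / 1000.0


monomer = approx_mass_kda(388)      # AspBHI is 388 residues long
tetramer = approx_mass_kda(388, 4)  # homo-tetramer
print(f"monomer ~{monomer:.1f} kDa, tetramer ~{tetramer:.1f} kDa")
```

Under this approximation the monomer is about 43 kDa and the tetramer about 171 kDa, so the two states should be readily distinguishable on a calibrated size-exclusion column.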

Table 1 | Summary of diffraction and refinement statistics of AspBHI crystals 

Monomeric AspBHI structure. Focusing on molecules A and B, the 
monomeric AspBHI contains two domains, connected by a 10-residue 
linker (residues 212 to 221) including residue Pro216 
(Fig. 2f). Among the family members, AspBHI is the smallest in 
length (388 residues), while MspJI is the largest (456 residues) 
(Fig. 1a). Superimposing the AspBHI and MspJI structures 
revealed that MspJI has seven insertions of five to eight residues in 
the N-terminal DNA-binding domain, mostly in the loops, and a 
15-residue extension at the C-terminus (Fig. 1a). One interesting 
difference lies in the 20-residue-long curved strand β8 in MspJI, 
where AspBHI has an 8-residue insertion that breaks the strand 
into two parts (Fig. 2g). The insertion includes a 3₁₀ helix that 
protrudes into the C-terminal helix bundle of molecule B (Fig. 2h). 
The main-chain carbonyl oxygen of Ser368 of molecule B forms a 
hydrogen bond with the main-chain amide nitrogen of Ala149 of 
molecule A, connecting helix αL of molecule B with the 3₁₀ helix of 
molecule A (Fig. 2h). 

A model of the N-terminal SRA-like DNA-binding domain in 
complex with DNA. Like MspJI, the N-terminal domain of 
AspBHI is structurally similar to the eukaryotic SET and RING-associated 
(SRA) domain of UHRF1 (Fig. 3a-b), which binds to 
hemi-methylated 5mCpG dinucleotide sequences. The C-terminal 
domain of AspBHI is structurally similar to several 
prokaryotic Type II endonucleases (Fig. 3c-d). We created a model 
of the AspBHI N-terminal SRA-like domain bound to DNA, using 
the coordinates of the mouse SRA-DNA complex. After 
superimposing the protein components, the bound DNA was 
positioned over the mostly basic surface of AspBHI except for an 
apparent acidic pocket. An equivalent pocket is present in the SRA-DNA 
complex, where it forms the binding site for the methylated 
cytosine, which is flipped out from the DNA helix (Fig. 3b). The 
flipped 5mC models accurately into the AspBHI pocket, in a 
position to interact with Asp71 via two hydrogen bonds and Tyr82 
via planar stacking contact. Asp71 is part of the loop between strands 
β4 and β5 and is the last residue prior to strand β5. Tyr82 is part of 
strand β6, which is anti-parallel to strand β5 and is positioned 
alongside Asp71. These two amino acids are conserved among the 
AspBHI family enzymes (Fig. 1a) and also among known SRA 
domains, where Asp474 and Tyr483 of mouse UHRF1 interact 
with the flipped 5mC in the same way. The methyl group of 5mC 
interacts with the Cα and Cβ atoms of Ser486 in UHRF1 (Fig. 3b), 
and likely does the same with Asp85 of AspBHI, the side chain of 
which points away from the binding pocket (Fig. 3b). Mutating 
Asp71, Tyr82 or Asp85 to alanine abolished AspBHI activity 
(Fig. 4, lanes 9-11), indicating that these residues are essential for 
binding the flipped 5mC nucleotide, for subsequent endonuclease 
catalysis, or for both. 

In order to hydrogen bond with the ring atom N3 and the exocyclic 
amino group N4 (NH2) of the flipped 5mC (Fig. 3b), the side 
chain carboxylate group of Asp71 must be in the protonated state, 
even though the pKa of this group in solution (3.9) is well below the 
pH (7.9) at which the enzyme is active. The same must be true for 
Asp474 of UHRF1, and also for the conserved binding pocket glutamate 
of motif V ('ENV') of the 5mC-methyltransferases, which 
likewise hydrogen bonds with the flipped substrate cytosine preparatory 
to methyl transfer. 

Figure 2 | Structure of AspBHI. (a) Four AspBHI monomers, A, B, C and D, form a tetramer. Molecules C and D have mobile C-terminal domains 
(indicated by a circle). (b) AspBHI tetramer, rotated ~90° from the view of panel (a). (c) For comparison, the intact MspJI tetramer is shown in an 
orientation similar to that of panel (a). (d) The disordered C-terminal domains of molecules C and D of the AspBHI tetramer were located in the void space along the 
crystallographic 6-fold axis with a diameter of 100 Å. (e) Elution profile of AspBHI on Superdex 200™ 10/300 GL (GE Healthcare). The column buffer 
was 20 mM Tris-HCl (pH 7.5), 300 mM NaCl and 1 mM DTT, and 150 ng of AspBHI was loaded onto the column. The inset shows the standardization 
of the size-exclusion column using a Gel Filtration Markers Kit for Protein Molecular Weights (SIGMA-ALDRICH, Cat. No. MWGF1000) at the 
time AspBHI was profiled using the same buffer. (f) Monomeric AspBHI contains two domains connected by a linker. (g) AspBHI has a discontinuity in 
strand β8 owing to the insertion of a 3₁₀ helix (right panel), whereas MspJI has a corresponding 20-residue-long curved strand β8 (left panel). Pairwise 
sequence alignment is shown above the panels. (h) The 3₁₀ helix of molecule A is involved in the dimer interface with the C-terminal helix αL of molecule 
B. The amino end of the 3₁₀ helix (Ala149 of molecule A) interacts with the carboxyl end of helix αL (Ser368 of molecule B). Arrows indicate helical 
dipoles. 

Figure 3 | A model of AspBHI in complex with DNA. (a) Superimposition of the AspBHI N-terminal domain (in green) with the SRA domain of mouse 
UHRF1 (in yellow; PDB 3FDE). (b) The flipped 5mC nucleotide can be docked into the binding pocket of AspBHI. (c) Superimposition of the AspBHI 
C-terminal endonuclease domain (in green) and the HindIII-DNA complex (conserved secondary elements in yellow and additional in grey) (PDB 2E52). 
of conserved Asp282 in AspBHI, pointing away from the active site, might undergo conformational change upon DNA binding. (e) A model of the 
AspBHI N-terminal domain docked with a DNA (taken from PDB 3FDE) containing a flipped 5mC (which is faded in the background). The opposite 
guanine is labeled. The Loop-B3 occupies the DNA minor groove 5' to the 5mC, while the Loop-2B occupies the minor groove 3' to the 5mC. 
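
The size of the pKa argument can be made concrete with the Henderson-Hasselbalch relation. The sketch below is illustrative only: it treats the carboxylate as a free group in solution with the pKa (3.9) and pH (7.9) quoted above, and the function name protonated_fraction is an assumption.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an acid in the protonated (HA) form at a given pH.

    From Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa),
    so HA / (HA + A-) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Solution pKa of an aspartate side chain and the assay pH quoted in the text.
fraction = protonated_fraction(pka=3.9, ph=7.9)
print(f"protonated fraction in bulk solution: {fraction:.1e}")
```

With a difference of four pH units, only about one carboxylate in 10,000 would be protonated in bulk solution, which is why the binding pocket environment would have to raise the effective pKa of Asp71 substantially for the proposed hydrogen bonds to form.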

Our model of the AspBHI N-terminal domain bound to DNA, 
derived from the UHRF1 SRA-DNA complex, suggests that three 
loops (Loops 2B, B3 and 6C) might intrude into the DNA minor or 
major grooves (Fig. 3a and 3e) and provide the interactions needed 
for AspBHI to recognize its DNA substrate sequence. Loop-2B (residues 
23-31 between strand β2 and helix αB) could make base-specific 
contacts in the minor groove on the 3' side of the flipped 5mC, 
where N(C/G) is recognized, and Loop-B3 (residues 39-43 between 
helix αB and strand β3) could make base-specific contacts in the 
minor groove on the 5' side where (T/C)(C/G) is recognized. 
Loop-2B is unique to AspBHI in sequence among the family members 
(Fig. 1b) as well as in length compared with UHRF1. The corresponding 
loop in UHRF1 is a one-residue sharp turn. Alanine 
mutations of potential contact residues within Loop-2B were constructed 
and tested. K24A and R27A cleaved phage DNA similarly to 
WT AspBHI (Fig. 4, lanes 2 and 4), but plasmid digestion was somewhat 
reduced, especially for K24A. T25A and D32A (Asp32 is an 
invariant residue within the family; Fig. 1b) abolished cleavage activity 
altogether (Fig. 4, lanes 3 and 5). 

Loop-B3 contains Ser41 and Arg42, which are unique to AspBHI 
(Fig. 1b). The corresponding loop in UHRF1 also approaches the 
DNA from the minor groove and contains Val451, which occupies 
the space left behind by the flipped 5mC, and His450, which interacts 
with the 5' base pair. To examine the effects of Loop-B3 mutations, 
we changed Ser41 and Arg42 to all 19 other amino acids (the results 
are discussed below). The third loop, Loop-6C, lies between strand β6 
and helix αC (residues 84-99). The corresponding loop in UHRF1 
contains Arg496, which hydrogen bonds from the major groove with 
the intra-helical orphaned guanine (Fig. 3a). Loop-6C is six residues 
shorter than its UHRF1 counterpart, and it adopts a different conformation, 
perhaps owing to the absence of DNA (Fig. 3a), making it too 
short to reach the DNA major groove in the current model. 
Figure 4 | AspBHI variants and activity assays on modified plasmid and phage DNA substrates. (a) SDS-PAGE analysis of partially purified His-tagged 
AspBHI WT and its variants after nickel-chelated affinity chromatography. Arrow indicates the AspBHI protein band. (b) Endonuclease activity assay on 
phage XP12 DNA containing 5mC. Three concentrations of WT AspBHI (~0.57 pmoles, with 2-fold serial dilution) were used in the digestion. 
Mutant enzyme concentrations were estimated at 0.29 to 0.57 pmoles. The smearing may result from partial digestion of the phage DNA. We note that 
S41C protein tends to precipitate in conditions with <0.2 M NaCl. (c) Endonuclease activity assay on Dcm+ and M.HpaII modified pUC19 DNA. 



Nevertheless, Loop-6C is a prime candidate for making base-specific 
interactions in the major groove if the substrate DNA and/or protein 
undergo structural rearrangement during binding. 

S41A and S41C variants have altered cleavage activities. Substitutions 
of Ser41 by other amino acids drastically reduced enzyme 
activity (data not shown) except for the alanine (S41A) and 
cysteine (S41C) replacements. These two variants showed 
somewhat different cleavage properties towards modified plasmid 
or phage DNA compared to the WT enzyme (Fig. 4, lanes 6-7): 
S41A cleaved phage XP12 DNA similarly to WT enzyme (Fig. 4b, 
lane 6), but barely cleaved pUC19 DNA, except for converting 
supercoiled DNA to nicked intermediate (only one strand cut) and 
linear form (one double-strand cut) (Fig. 4c, lane 6). S41C 
demonstrated the opposite effect: it cleaved phage XP12 DNA 
much less efficiently than pUC19. The phage DNA appears to be 
trapped by the precipitated S41C protein (Fig. 4b, lane 7, the band 
near the top loading well), although it is not clear whether the bound 
DNA had been cleaved. 

To investigate the specificity of the S41A and S41C variants, we 
used three 56-bp synthetic duplexes containing the symmetric 
sequence 5'-NC(5mC)GGN-3' (Fig. 5a), methylated on both 
strands. If the enzyme recognizes the top-strand methylated site, 
cleavage on the 3' side N12/N16 away will result in two products of 
43 bp and 9 bp, both with a 4-nt overhang. We termed these products 
P1 and P5, with averaged lengths of 45 bp and 11 bp 
(Fig. 5b). [The product P5 was not observed, probably because it 
was too small to be stained or because the small duplex (9 bp + 4 nt 
overhang) dissociated at 37 °C after cleavage and the two short 
single-stranded oligonucleotides ran out of the gel.] If the enzyme 
recognizes the bottom-strand methylated site, cleavage will result in two 
products of 39 bp (P2) and 17 bp (P4). And if the enzyme recognizes 
both top- and bottom-strand methylated sites, cleavage on both sides 
will result in three products with averaged lengths of 28 bp (P3), 17 bp 
(P4), and 11 bp (P5). The cleavage products were resolved using 20% 
native PAGE (Fig. 5b). The results indicate that AspBHI is capable of 
cleaving the substrates having a 5' pyrimidine base (T or C) (lanes 1 
and 7) but not a guanine (or adenine): lane 4 of Fig. 5b only shows 
top-strand (with a 5' C) recognition products, P1 and P5 (not visible), 
but not the bottom-strand (with a G) recognition products P2 and P4. 
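
The expected product sizes follow from the positions of the two 5mC residues in the 56-bp duplex and the N12/N16 cleavage rule. The sketch below reproduces that arithmetic; it assumes that N12 and N16 count the nucleotides lying between the 5mC and the nick on the methylated and opposite strands, respectively, which is the reading consistent with the sizes stated above. The function names are hypothetical.

```python
# 56-mer substrate from Methods; M = 5mC, N = variable base.
TOP = "CGGCGTTTCCGGGTTCCATAGGCTCCGCNCMGGNCTCTGATGACCAGGGCATCACA"     # 5'->3'
BOTTOM = "GCCGCAAAGGCCCAAGGTATCCGAGGCGNGGMCNGAGACTACTGGTCCCGTAGTGT"  # 3'->5', aligned under TOP

TOP_M = TOP.index("M") + 1        # top-strand coordinate of the top-strand 5mC
BOTTOM_M = BOTTOM.index("M") + 1  # top-strand coordinate paired with the bottom-strand 5mC


def nick_positions(recognized: str) -> tuple[list[int], list[int]]:
    """Nicks in top-strand coordinates p, meaning a break between p and p+1."""
    top_nicks, bottom_nicks = [], []
    if recognized in ("top", "both"):
        top_nicks.append(TOP_M + 12)     # N12 on the methylated (top) strand
        bottom_nicks.append(TOP_M + 16)  # N16 on the opposite strand
    if recognized in ("bottom", "both"):
        # The bottom strand runs 5'->3' toward lower top-strand coordinates.
        bottom_nicks.append(BOTTOM_M - 12 - 1)
        top_nicks.append(BOTTOM_M - 16 - 1)
    return sorted(top_nicks), sorted(bottom_nicks)


def strand_fragments(nicks: list[int], length: int = len(TOP)) -> list[int]:
    """Lengths of the single-strand pieces produced by the given nicks."""
    edges = [0] + nicks + [length]
    return [b - a for a, b in zip(edges, edges[1:])]


def averaged_products(recognized: str) -> list[float]:
    """Average of top- and bottom-strand lengths for each product, left to right."""
    top_nicks, bottom_nicks = nick_positions(recognized)
    return [(t + b) / 2 for t, b in
            zip(strand_fragments(top_nicks), strand_fragments(bottom_nicks))]


for mode in ("top", "bottom", "both"):
    print(mode, averaged_products(mode))
```

Under these assumptions, top-strand recognition yields averaged products of 45 and 11 (P1, P5), bottom-strand recognition yields 17 and 39 (P4, P2), and recognition of both sites yields 17, 28 and 11 (P4, P3, P5), matching the sizes given in the text.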

The S41A variant showed lower activity in cleaving all three substrates, 
as a significant amount of full-length duplex oligonucleotide 
remained (Fig. 5b, lanes 2, 5 and 8). However, it appeared to prefer 
the S9 substrate, with the two 5' most positions being a C on both 
strands, compared with substrate S7, which has 5' T or 5' C on each 
strand (comparing lanes 2 and 8). This is in contrast to the WT 
enzyme, which cleaved substrate S7 better (comparing lanes 1 and 7), 
suggesting a potential change of substrate specificity. On the other 
hand, approximately equal amounts of P1 and P2 products were 
generated by S41A on the S7 substrate (lane 2), suggesting S7 might be a 
poor substrate for S41A, regardless of a 5' T or 5' C. The S41C variant 
had a digestion pattern similar to that of the WT enzyme. However, 
in addition to the predominant cleavage position at N12/N16 from the 
modified cytosine, S41C appears to have additional cleavage positions 
(marked with asterisks in lanes 6 and 9), a phenomenon 
previously described as wobble cleavage. 

Arg42 is essential for activity. A total of 19 R42X variants (X being any natural 
amino acid other than arginine) were constructed by site-directed 
mutagenesis. All 19 variants were purified through nickel-chelated 
and heparin affinity chromatography. All were inactive in cleaving 
modified plasmid DNA, including the conservative Arg42-to-lysine 
substitution (data not shown). Arg42 might interact with the target 
5mC:G base pair (the only unambiguous base pair within the 
recognition sequence) during the initial protein-DNA encounter or 
stabilize the flipped 5mC via interaction with the orphaned guanine 
for enhanced recognition and tightening of the protein-DNA 
complex, thereby promoting cleavage. The precise way in 
which Arg42 and Ser41 mediate specific DNA recognition awaits 
the solution of a protein-DNA complex structure. 

staining. Inserted is a 10-20% gradient SDS-PAGE showing the proteins used for crystallization (Se-Met) and for activity (WT, S41A and S41C). NEB 
protein ladder was used as molecular weight markers. 

Discussion 

The wide diversity of restriction enzymes, from the smallest 
dimeric PvuII, to tetrameric Type IIP enzymes, and the polymerized 
SgrAI, makes them versatile tools for laboratory experimentation, 
and fascinating subjects for studies of molecular architecture. 
Here we show structurally that the modification-dependent restriction 
enzyme AspBHI comprises two domains, one typically eukaryotic 
and the other typically prokaryotic. The N-terminal part of 
AspBHI (residues 1-211) resembles an SRA-like 5-methylcytosine 
binding domain in structure and function. It recognizes 5mC within 
the specific DNA sequence context. The C-terminal part of AspBHI 
(residues 222-388) resembles a classic Type II restriction endonuclease 
of the PD-(D/E)XK superfamily. It is attached to the 
N-terminal domain by a 10-residue loop, and cleaves duplex DNA 
outside of the recognition sequence on one side, N12/N16 3' downstream 
of the 5mC, somewhat like a Type IIS restriction enzyme. 



FokI, the best-known Type IIS enzyme, has a similar domain 
organization comprising an N-terminal recognition domain and a 
C-terminal catalytic domain. It also recognizes an asymmetric 
sequence and cleaves downstream N9/N13, but there the similarities 
stop. FokI is monomeric in solution and double-strand (ds) cleavage 
occurs by transient dimerization between the catalytic domains of 
neighboring molecules, at least one of which is bound to a recognition 
site (refs 26, 27). AspBHI (and MspJI), in contrast, assembles into a tetramer, 
even in the absence of DNA, with two centers for ds DNA cleavage 
(i.e. two catalytic-domain-mediated dimers) and four 5mC-recognition 
domains. A complex model based on structural and biochemical 
evidence has been proposed for MspJI, and likely also applies 
to AspBHI, in which three monomers of the tetramer are involved, 
respectively, in binding modified cytosine, making the first 
proximal N12 cleavage in the same strand, and then making the 
second distal N16 cleavage in the opposite strand. In contrast to 
AspBHI, the N6-methyladenine-dependent restriction enzyme 
DpnI comprises an N-terminal combined recognition and catalytic 
domain and a C-terminal non-catalytic DNA-binding domain 
(opposite of the domain arrangement of AspBHI and MspJI), and 
is monomeric. 
The variety of restriction enzymes also makes them fascinating 
subjects for studying protein-DNA interactions among enzymes 
with a common basic function: highly specific DNA recognition 
and cleavage. Surprisingly, even for very well characterized restriction 
enzymes such as EcoRV, the mechanistic features that determine 
specificity and selectivity are difficult to model on the basis of 
the available structural information. Other than requiring a 5mC:G 
base pair, AspBHI is promiscuous in the bases it recognizes on either 
side of the modified cytosine: 5'-(C/T)(C/G)(5mC)N(C/G)-3'. For 
example, the 5' most base can be a thymine or cytosine but not a 
guanine (or adenine) (Fig. 5b). We attempted to relax specificity 
further on the 5' side of the 5mC by targeted mutagenesis of Ser41 
and Arg42, but we were unsuccessful. Arg42, which is not conserved 
among family members (Fig. 1b), was found nevertheless to be essential 
for enzyme activity, and all Arg42 mutants were inactive. Ser41 
mutants were likewise inactive except S41A and S41C. Interestingly, 
S41A, which loses the ability to make hydrogen bonds, showed somewhat 
different cleavage properties towards modified oligonucleotides 
with variation at the outermost 5' (C/T) position. Although considerable 
progress has been made regarding the mechanisms of action of 
restriction enzymes, many challenges remain, the most ambitious 
perhaps being the engineering of enzyme variants with new 
specificities. 

Methods 

All enzymes, plasmids and bacterial strains, if not otherwise specified, were obtained 
from New England Biolabs (NEB). Escherichia coli codon-optimized AspBHI with an 
N-terminal 6xHis tag was cloned into a pUC19 derivative pZZl (Z. Zhu, NEB) 
between NdeI and BamHI sites. Site-directed mutagenesis was carried out by inverse 
PCR using Vent® DNA polymerase and mutagenic primers designed with NEB in-house 
software. The entire alleles in AspBHI variants were sequenced to confirm the 
desired mutation. 

Protein expression and purification. Wild type (WT) and mutant AspBHI with 
N-terminal 6xHis tags were expressed in a Dcm-deficient E. coli strain T7 Express 
(C2566). Cells were grown at 30 °C in 10 mL (small scale) or 0.5 to 1 L (medium scale) 
of LB + Amp to OD600 0.3-0.6 and induced with a final concentration of 0.5 mM 
isopropyl β-D-1-thiogalactopyranoside (IPTG). Induced cultures were grown 
overnight at 25 °C, harvested and then kept at −20 °C. His-tagged proteins (small 
scale) were partially purified using a Qiagen Ni-NTA spin kit as recommended by the 
supplier and used in the experiments shown in Fig. 4. For medium-scale production, 
cells were lysed by sonication in 20 mM Tris-HCl, pH 7.5, 400 mM NaCl, 20 mM 
imidazole. Clarified cell extract was loaded over a gravity column using a Ni-NTA 
resin (Qiagen). Protein was eluted with 500 mM imidazole. Pooled fractions were 
then diluted 10-fold in 20 mM Tris-HCl, pH 7.5, 20 mM NaCl and loaded over a 
5 mL Hi-Trap Heparin column using an ÄKTA FPLC machine (GE Healthcare). The 
proteins were eluted at ~250-290 mM NaCl with a linear gradient of 20 mM to 1 M 
NaCl. Fractions containing AspBHI were identified on 10-20% gradient Tris-Glycine 
gels (Novex/Life Technologies) with the protein appearing as the major band (purity 
approximately 95%; Fig. 5b inset). Proteins were diluted to a working stock of 
0.5-1 mg ml⁻¹ and used in the experiments shown in Fig. 5b. 

Crystallography. For crystallization of AspBHI, 12 L of IPTG-induced E. coli 
cultures were harvested and the non-tagged enzyme was purified to homogeneity by 
chromatography through Heparin DM, Bio-Gel HTP hydroxyapatite, Mono Q, and 
Heparin TSK columns. Alternatively, further purification was performed via tandem 
HiTrap Q/SP (GE Healthcare) and a sizing column Superdex 200 (GE Healthcare). 
The position of the protein peak in the Superdex 200 column suggests the protein to 
be a tetramer (Fig. 2e). 

Final concentrations of the protein were between 6 and 20 mg ml⁻¹ in 20 mM Tris-HCl 
(pH 8.0), 150 mM NaCl, 10% glycerol, 1 mM ethylenediaminetetraacetic acid 
(EDTA), and 1 mM dithiothreitol (DTT). Crystallizations were carried out by the 
hanging-drop vapor-diffusion method at 16 °C using equal amounts of protein and 
well solutions. Conditions giving large and well-diffracting AspBHI crystals were (i) 
12% polyethylene glycol 3350 with 0.5 M K2HPO4/Na2HPO4 (pH 7.4) and (ii) 6-15% 
polyethylene glycol MME 5000, 5% Tacsimate (Hampton Research), and 100 mM 
HEPES (pH 6.2-7.4). The AspBHI crystal structure was solved by multi-wavelength 
anomalous diffraction phasing methods using three datasets: a native AspBHI 
dataset, a Se anomalous dataset from a selenium-methionine (SeMet) labeled 
Leu228-to-Met (L228M) mutant crystal, and a Hg anomalous dataset from an L228M mutant 
crystal soaked with ~5 mM K2HgI4 overnight (Table 1). 

AspBHI contains two methionines at residues 30 and 214 in addition to the 
N-terminal methionine. To increase the phasing potential of SeMet labeled crystals, we 
mutated Leu228 to Met because other family members (RlaI and LpnPI) have a 
methionine at the corresponding position (Fig. 1b), and the mutant protein was 
utilized for phasing purposes. A total of ten Se atoms were found in the asymmetric 
unit of the selenium-methionine labeled crystal, three each for molecules A and B and 
two each for molecules C and D (the L228M sites, located in the disordered C-terminal domains of 
molecules C and D, were not detected). In the Hg derivative, a total of four Hg2+ atoms 
were found in the asymmetric unit, two of which reacted with Cys255 and Cys306 of 
molecules A or B. All the data sets were processed using the program HKL2000, 
which calculated values of Rmerge and ⟨I/σI⟩ (Table 1). Phasing, map production, 
and model refinement were conducted using the PHENIX software suite. The 
AutoSol Wizard of PHENIX used RESOLVE to carry out density modification and 
applied non-crystallographic symmetry (NCS) calculated from positions of heavy-atom 
sites, resulting in a multiple isomorphous replacement with anomalous scattering 
(MIRAS) electron density map of superior quality compared to either single 
anomalous diffraction (SAD) map. COOT was used to visualize maps and models and 
for manual model manipulation during refinement rounds, which excluded the disordered 
C-terminal domains of molecules C and D. Individual thermal B-factors were 
refined only at the end stages of refinement, with an averaged root-mean-square 
deviation of 3.7 Å² for main chain atoms and 5.1 Å² for side chain atoms, and did not 
vary significantly for any ordered domain of the modeled monomers. The distribution of 
averaged crystallographic thermal B-factor per residue for the four monomers is 
shown in Figure 1c, with the highest B-factors occurring in the loops. 

DNA cleavage assays using methylated plasmids and phage DNA. Dcm^ pUC19 
(100 µg) was incubated with various methyltransferases (M.AluI, M.SssI, M.HaeIII, 
M.HpaII, M.HhaI, or M.MspI) overnight at 37 °C in the presence of 32 mM AdoMet 
(160 mM AdoMet for M.SssI) in a total reaction volume of 500 µL. Reactions were 
treated with 5 µL Proteinase K (10 mg ml⁻¹) for 1 h at 37 °C. Plasmids were then 
purified by spin column (Qiagen) and the DNA concentration was measured using 
the Nanodrop. 

For plasmid digestions, 100 to 300 ng of DNA was digested with 1-5 µg of AspBHI 
(1 mg ml⁻¹) in NEB buffer 4 in the presence of 15 µM of a self-annealed stem-loop 
activator (5'-CTCCMAGGATCTTTTTTGATCMTGGGAG-3', where M = 5mC). 
Adding an activator with the recognition sequence in trans can accelerate the slow 
reactions by the AspBHI family members. Titrations of AspBHI were done using 
dilution buffer (diluent B, NEB). Enzyme titration was carried out to make sure that 
the AspBHI concentration used in digestion was not inhibitory. Digestions were 
carried out for 2 h at 37 °C and then treated with 2 µL proteinase K for 15 min. 
Digestion products were resolved and visualized after running on a 1% agarose gel 
(Figure 4). 

Phage XP12 DNA (bacterial host Xanthomonas oryzae) was a gift from Dr. Peter 
Weigele (NEB). XP12 phage particles were purified from lysate by CsCl gradient 
centrifugation and the DNA was further purified by phenol-CHCl3 extraction and 
ethanol precipitation. The phage DNA contains 5-methylcytosine, which serves as a 
substrate for modification-dependent restriction enzymes. The endonuclease 
digestion was terminated by addition of a loading dye with ethylenediaminetetraacetic 
acid (EDTA), sodium dodecyl sulfate (SDS), and glycerol. We used both XP12 
phage DNA (which is methylated at every cytosine) and 5mC-modified pUC19 
(which is methylated at specific sites) to corroborate the mutant activity. In 
general, most of the mutant activity is consistent on both substrates except for S41C, as 
shown in Figure 4. 

Digestion of fully methylated oligonucleotides. Three sets of 56-base pair (bp) 
oligonucleotides containing NCMGGN (M = 5mC, N = A, T, C or G) were used for 
digestion as described: 

5'-CGGCGTTTCCGGGTTCCATAGGCTCCGCNCMGGNCTCTGATGACCAGGGCATCACA-3' 

3'-GCCGCAAAGGCCCAAGGTATCCGAGGCGNGGMCNGAGACTACTGGTCCCGTAGTGT-5' 

Duplex oligonucleotide substrates (20 ng) were incubated with 0.5 µg of AspBHI 
(WT, S41A, or S41C) in NEB buffer 4 with a final volume of 10 µL at 37 °C for 2 h and 
then treated with 0.5 µL proteinase K for 15 min. Digestion products were resolved 
on a 20% native TBE PAGE gel (Life Technologies), stained with SYBR Gold (Life 
Technologies) and visualized using a Typhoon 9400 imager (GE) (Fig. 5b). 